Error-Bounding in Level-Index Computer Arithmetic

D.  W.  Lozier  and  P.  R.  Turner 

Applied  and  Computational  Mathematics  Division 
National  Institute  of  Standards  and  Technology 
Gaithersburg,  MD  20899 

Mathematics  Department 
United  States  Naval  Academy 
Annapolis,  MD  21402 

Abstract.  This  paper  proposes  the  use  of  level-index  (LI)  and  symmetric  level-index 
(SLI)  computer  arithmetic  for  practical  computation  with  error  bounds.  Comparisons 
are  made  with  floating-point  and  several  advantages  are  identified. 

1 Introduction 

Any  approach  to  the  general  problem  of  assessing  the  total  error  in  the  output  of 
computer  programs  depends  on  a detailed  understanding  of  the  computer  arithmetic. 
The  finite  precision  of  the  arithmetic  gives  rise  to  rounding  errors  that  can  be  an 
important  component  of  the  total  error.  Accordingly,  much  effort  has  gone  into  refining 
the  algorithms  and  circuitry  that  carry  out  floating-point  arithmetic.  One  goal  of  this 
effort has been to minimize rounding errors. Another has been to ensure that exceptional
conditions,  such  as  underflow  and  overflow,  are  detected  and  reported  because  their 
occurrence  can  completely  invalidate  the  results  of  a computation.  The  present  state 
of  floating-point  hardware  design  [5]  is  close  to  optimal,  and  so  the  question  arises:  Is 
there  a radically  different  system  of  arithmetic  with  properties  that  are  superior? 

An answer, proposed a little more than ten years ago [1] and known as level-index
arithmetic,  is  based  on  representing  positive  numbers  by  generalized  logarithms.  These 
representations  are  obtained  by  repeatedly  taking  logarithms  until  a result  between 
zero  and  unity,  the  index,  is  obtained.  The  corresponding  level  is  the  number  of 
times  the  logarithm  was  taken.  The  level  (an  integer)  and  the  index  (a  fraction)  are 
added  and  stored  in  a fixed-point  location  as  the  internal  representation  of  real  positive 
numbers.  This  describes  the  unsymmetric  form  of  level-index  arithmetic.  There  is  also 
a symmetric  form  [3]  in  which,  effectively,  real  numbers  less  than  unity  in  magnitude 
are  reciprocated  before  being  stored. 

The main purpose of this paper is to compare the new arithmetic against the old, particularly
in regard to interval arithmetic and other error-bounding techniques. Among
the  advantages  will  be  (i)  an  immunity  to  extraneous  considerations  necessitated  by 
underflow  and  overflow;  (ii)  a unified  error  analysis  that  naturally  blends  absolute 
errors,  relative  errors,  and  higher-order  generalized  errors;  and  (iii)  a natural  means 
for  increasing  precision  when  needed  within  an  algorithm. 

The  representation  of  real  numbers  in  a computer  is  based  on  a mapping  of  the  form 

(1)  $X \in S \subset \mathbb{R} \;\longmapsto\; x \in \mathcal{R}_w \subset \mathbb{R}$

where $S$ and $\mathcal{R}_w$ are subsets of the real numbers. $\mathcal{R}_w$ is a finite subset associated with
computer words of length $w$ bits. Elements of $\mathcal{R}_w$ are called internal numbers, those
of $S$ external numbers. Usually $S$ is a finite interval such as $(-M, M)$ or $(M^{-1}, M)$.

To  be  useful  in  representing  external  numbers,  the  mapping  should  be  invertible  but 
of  course  this  is  possible  only  in  an  approximate  sense.  Accordingly,  suppose  that 

(2)  $x = \bar{f}(X)$

where $\bar{f}$ is an approximation to a continuous real function $f$ that is invertible with
inverse function $g$. Then we define the generalized error function

(3)  $\mathrm{generr}(X, \bar{X}) := |f(X) - f(\bar{X})|$

where

(4)  $\bar{X} = g(x)$

is our external approximation to $X$.

Commonly used representations are fixed-point, logarithmic and floating-point. The
fixed-point representation function is the identity. In this case $\bar{X} = x$ and, assuming
$S = (-1, 1)$, $x$ is obtained by rounding the binary expansion of $X$ to $w - 1$ bits (one
bit is needed for the sign of $X$). The generalized error is just the absolute error, and
the inequality

(5)  $\mathrm{generr}(X, \bar{X}) = |X - \bar{X}| \le 2^{-w}$

is satisfied.

For the binary logarithmic representation on $S = (M^{-1}, M)$ with $M = 2^{2^m}$, $f$ is the
binary logarithm and $x = \bar{f}(X)$ is $\log_2 X$ rounded to $w - m - 2$ bits. Then

(6)  $\mathrm{generr}(X, \bar{X}) = |\log_2 X - \log_2 \bar{X}| \le 2^{m+1-w}$

where $\bar{X} = 2^x$. Since

(7)  $\log_2 X - \log_2 \bar{X} = \log_2\{1 + (X - \bar{X})/\bar{X}\}$

and similarly for $\log_2 \bar{X} - \log_2 X$, the generalized error is, to within terms of first
order, just the relative error times $1/\ln 2 = 1.44\cdots$. It may be noted in comparison
to fixed-point that an additional sign bit is needed and that the representation of zero
is special.
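
The first-order relation between the generalized error of the logarithmic representation and the relative error can be checked numerically. The following is only a sketch: the helper names `round_to_bits` and `log_representation_errors`, the sample value, and the choice of 20 fractional bits are assumptions made for illustration and do not come from the paper.

```python
import math

def round_to_bits(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits."""
    scale = 2.0 ** frac_bits
    return round(value * scale) / scale

def log_representation_errors(X, frac_bits):
    """Return (generalized error, relative error) for a binary logarithmic representation."""
    x = round_to_bits(math.log2(X), frac_bits)       # internal number
    X_bar = 2.0 ** x                                  # external approximation
    generr = abs(math.log2(X) - math.log2(X_bar))     # error measured in the logarithm
    relerr = abs(X - X_bar) / abs(X_bar)              # ordinary relative error
    return generr, relerr

generr, relerr = log_representation_errors(1234.5678, 20)
# To first order, generr should agree with relerr / ln 2.
print(generr, relerr / math.log(2))
```

Rounding to `frac_bits` fractional bits keeps the generalized error within half a unit in the last place, that is, within `2**-(frac_bits + 1)`.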

Floating-point  may  be  viewed  as  a modification  of  the  logarithmic  representation. 
Suppose $X \in (M^{-1}, M)$ and write

(8)  $\log_2 X = c(X) + \log_2\{1 + \sigma(X)\}$

where $c(X)$ is the unique integer determined by the condition $\sigma(X) \in [0, 1)$. Here $\sigma(X)$
is the fractional part of the floating-point significand and $2^{c(X)}$ is the scale factor. The
representation function is not so readily expressed as for the logarithmic representation,
since the signs of $c(X)$ and $\sigma(X)$ are opposite when $X < 1$. In effect, it is taken as
$f(X) = \sigma(X)$ with separate accounting for $c(X)$. The internal approximation $x$ is
$\sigma(X)$ rounded to $w - m - 2$ bits, and in the usual case when $c(\bar{X}) = c(X)$,

(9)  $\mathrm{generr}(X, \bar{X}) = |\sigma(X) - \sigma(\bar{X})| \le 2^{m+1-w}$.

For  IEEE  arithmetic  in  single  and  double  precision,  m = 7 and  m = 10,  respectively. 
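
The decomposition of a positive number into a scale exponent and a fractional significand part can be sketched with the standard library routine `math.frexp`. The function names `fp_decompose` and `fp_generr` are hypothetical, and the sketch ignores the rare case in which rounding the fraction carries into the exponent.

```python
import math

def fp_decompose(X):
    """Return (c, sigma) with X = 2**c * (1 + sigma), c an integer and 0 <= sigma < 1."""
    if X <= 0.0:
        raise ValueError("X must be positive")
    mantissa, exponent = math.frexp(X)    # X = mantissa * 2**exponent, 0.5 <= mantissa < 1
    c = exponent - 1
    sigma = 2.0 * mantissa - 1.0
    return c, sigma

def fp_generr(X, frac_bits):
    """|sigma(X) - rounded sigma(X)| when sigma is rounded to frac_bits bits.

    Assumes the rounded fraction stays below 1, so the exponent c is unchanged.
    """
    _, sigma = fp_decompose(X)
    scale = 2.0 ** frac_bits
    sigma_bar = round(sigma * scale) / scale
    return abs(sigma - sigma_bar)

print(fp_decompose(10.0))        # 10 = 2**3 * 1.25
print(fp_generr(math.pi, 23))    # 23 fraction bits, as in IEEE single precision
```

With 23 fraction bits the error returned by `fp_generr` cannot exceed `2**-24`, half a unit in the last place of the fraction.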

2 The  Level-Index  Representation 

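The generalized logarithm is, by definition, the function

(10)  $\psi(t) = \begin{cases} t & \text{if } 0 \le t < 1, \\ 1 + \psi(\ln t) & \text{if } t \ge 1, \\ -\psi(-t) & \text{if } t < 0. \end{cases}$

This function is invertible and its inverse function is given by

(11)  $\phi(t) = \begin{cases} t & \text{if } 0 \le t < 1, \\ e^{\phi(t-1)} & \text{if } t \ge 1, \\ -\phi(-t) & \text{if } t < 0. \end{cases}$

Both $\psi$ and $\phi$ are strictly increasing, continuous, and continuously differentiable on $\mathbb{R}$.
If $t \ge 1$, $\psi(t)$ is obtained by repeatedly taking logarithms until $\ln^{(\ell)} t = \ln \ln \cdots \ln t \in [0, 1)$.
Then $\psi(t) = \ell + \ln^{(\ell)} t$. By definition, $\ell$ is the level and $\ln^{(\ell)} t$ is the index of
$t$. For the inverse function, if $t = \ell + a \ge 1$ where $\ell$ is the integer part of $t$, then
$\phi(t) = \exp^{(\ell)}(a)$.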
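
The level and index construction just described translates directly into a short routine. The sketch below works in ordinary double precision, so `phi` overflows once its result exceeds the floating-point range; the names `psi` and `phi` simply mirror the generalized logarithm and its inverse and are not taken from any existing library.

```python
import math

def psi(t):
    """Generalized logarithm: level + index for t >= 1, identity on [0, 1), odd for t < 0."""
    if t < 0.0:
        return -psi(-t)
    level = 0
    while t >= 1.0:
        t = math.log(t)      # take one more logarithm
        level += 1
    return level + t         # t is now the index in [0, 1)

def phi(t):
    """Generalized exponential, the inverse of psi."""
    if t < 0.0:
        return -phi(-t)
    level = math.floor(t)
    value = t - level        # the index a in [0, 1)
    for _ in range(level):
        value = math.exp(value)
    return value

print(psi(1000.0), phi(psi(1000.0)))
```

For example, `phi(3.0)` applies the exponential three times to 0 and returns $e^e$, while `psi` undoes those steps one logarithm at a time.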

For level-index, or LI, computer arithmetic we take $S = (-\phi(8), \phi(8))$ and $x = \bar{\psi}(X)$
where $x$ is obtained by rounding the binary expansion of $\psi(X)$ to $w - 4$ bits. The
following table supports this choice of $S$:


  $x$    $\phi(x)$       |  $x$            $\phi(x)$                   |  $x$            $\phi(x)$
  0      0               |  3              $15.15\cdots$               |  $4.63\cdots$   $2^{1024} \approx 10^{308}$
  1      1               |  4              $10^{6.58\cdots}$           |  $4.80\cdots$   $2^{16384}$
  2      $2.72\cdots$    |  $4.40\cdots$   $2^{128} \approx 10^{38}$   |  $4.99\cdots$   $2^{5502841} \approx 10^{1656520}$

We see that $x = 4.40$ corresponds approximately to the IEEE standard overflow threshold
in single precision ($w = 32$). Overflow thresholds in double precision ($w = 64$)
and quadruple precision ($w = 128$) are reached at $x = 4.63$ and $x = 4.80$, respectively.
Allocating 3 bits to the level and $w - 4$ to the index (the remaining bit is the sign
bit) allows us to represent numbers in the vast interval $(-\phi(8), \phi(8))$. Indeed, $\phi(6)$ is
already so large that it is impractical to express it as a floating-point number.
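
The level-index images of the overflow thresholds can be recomputed without ever forming the huge powers of two, because the generalized logarithm of $2^k$ is one more than the generalized logarithm of $k \ln 2$. The sketch below assumes this identity and an ordinary double-precision evaluation; the helper name `psi_of_power_of_two` is hypothetical.

```python
import math

def psi(t):
    """Generalized logarithm for t >= 0: number of logarithms taken plus the final index."""
    level = 0
    while t >= 1.0:
        t = math.log(t)
        level += 1
    return level + t

def psi_of_power_of_two(k):
    """psi(2**k) for an integer k >= 1, using psi(t) = 1 + psi(ln t) and ln(2**k) = k ln 2."""
    return 1.0 + psi(k * math.log(2.0))

for k in (128, 1024, 16384, 5502841):
    print(k, psi_of_power_of_two(k))
```

The four exponents are those appearing in the table, so the printed values should agree with the tabulated arguments to the digits shown there; the last one falls just below 5.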

If  restricted  to  the  interval  (—1, 1),  LI  arithmetic  is  equivalent  to  fixed-point.  A 
symmetric  modification,  called  SLI  arithmetic,  is  more  analogous  to  floating-point. 
We take $S = (M^{-1}, M)$ where $M = \phi(8)$. The representation function becomes

(12)  $\Psi(t) = \psi(\ln t) = \begin{cases} \psi(t) - 1 & \text{if } t \ge 1, \\ 1 - \psi(1/t) & \text{if } 0 < t < 1, \end{cases}$

with inverse function

(13)  $\Phi(t) = e^{\phi(t)}$, which equals $\phi(t + 1)$ for $t \ge 0$.

Again, 3 bits are allocated to the level. With one bit each for the signs of $X$ and
$\Psi(X)$, $w - 5$ bits are allocated to the index.
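
A minimal sketch of the symmetric representation function and its inverse follows, built on the generalized logarithm and exponential. It assumes double-precision evaluation, and the Python names `Psi` and `Phi` are chosen here only to mirror the notation.

```python
import math

def psi(t):
    """Generalized logarithm (odd extension for negative arguments)."""
    if t < 0.0:
        return -psi(-t)
    level = 0
    while t >= 1.0:
        t = math.log(t)
        level += 1
    return level + t

def phi(t):
    """Generalized exponential, inverse of psi."""
    if t < 0.0:
        return -phi(-t)
    level = math.floor(t)
    value = t - level
    for _ in range(level):
        value = math.exp(value)
    return value

def Psi(X):
    """SLI representation function for X > 0: psi applied to ln X."""
    if X <= 0.0:
        raise ValueError("X must be positive")
    return psi(math.log(X))

def Phi(x):
    """Inverse of Psi: exp(phi(x)); negative x gives values below 1."""
    return math.exp(phi(x))

print(Psi(2400.0), Psi(1.0 / 2400.0))
```

Because `psi` is odd, a number and its reciprocal receive images of equal magnitude and opposite sign, which is the symmetry that gives SLI its name.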

In  [6]  it  is  proved  that  LI  and  SLI  arithmetic  with  3 bits  allocated  to  the  level  are 
both  closed.  That  is,  all  sums,  differences,  products,  and  quotients,  excluding  division 
by zero, of numbers in $\mathcal{R}_w$ are elements of $S$, provided only that $w$ does not exceed 5
million bits or so. Thus the rounded result of an arithmetic operation in $\mathcal{R}_w$ is again in
$\mathcal{R}_w$. In particular, this means that both overflow and underflow have been abolished
for  the  basic  arithmetic  operations. 

The generalized error for LI and SLI is defined by (3) and (4) with $f$, $g$ replaced by
$\psi$, $\phi$ and $\Psi$, $\Phi$, respectively:

(14)  $\mathrm{generr}(X, \bar{X}) = \begin{cases} |\psi(X) - \psi(\bar{X})| \le 2^{3-w} & \text{for LI}, \\ |\Psi(X) - \Psi(\bar{X})| \le 2^{4-w} & \text{for SLI}; \end{cases}$

cf. (5), (6) and (9). At level 1, when $1 \le X < e$, the generalized error is the relative
error in the external approximation (to within terms of first order). The behavior of
relative error for $X \ge e$ is the subject of the next section.
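
The effect of rounding the LI image to a fixed number of index bits can be simulated in double precision. The sketch below assumes a 32-bit word with $w - 4 = 28$ index bits; the function name `li_round_trip` and the sample values are illustrative choices only.

```python
import math

def psi(t):
    """Generalized logarithm for t >= 0."""
    level = 0
    while t >= 1.0:
        t = math.log(t)
        level += 1
    return level + t

def phi(t):
    """Generalized exponential for t >= 0, inverse of psi."""
    level = math.floor(t)
    value = t - level
    for _ in range(level):
        value = math.exp(value)
    return value

def li_round_trip(X, w=32):
    """Round psi(X) to w - 4 fractional bits; return (generalized error, relative error)."""
    scale = 2.0 ** (w - 4)
    x = round(psi(X) * scale) / scale     # simulated internal number
    X_bar = phi(x)                        # external approximation
    generr = abs(psi(X) - x)
    relerr = abs(X - X_bar) / X
    return generr, relerr

for X in (1.5, 50.0, 1.0e6, 1.0e30):
    print(X, li_round_trip(X))
```

The generalized error stays within half a unit in the last place of the index, `2**-(w - 3)`, for every sample, whereas the relative error is of that size only at level 1 and is amplified at higher levels.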

3 Representation  Errors 

For a given computer arithmetic the generalized error is measured in the set $\mathcal{R}_w$ of
internal numbers: it is bounded uniformly by a small constant that depends on $w$.
For  fixed-point,  logarithmic  and  floating-point  arithmetics  the  generalized  error  has 
a familiar  interpretation  in  the  external  set  S:  the  number  of  either  ‘decimal  places’ 
(in  fixed-point)  or  ‘significant  decimal  digits’  (in  logarithmic  and  floating-point)  is 
uniformly  bounded.  There  is  no  familiar  interpretation  of  generalized  error  for  level- 
index  arithmetic.  Accordingly,  a comparison  in  familiar  terms  is  needed. 

Figure 1 presents a comparison of SLI against IEEE floating-point for $w = 32$ and
$S = [1, 10^{\ldots}]$. The horizontal scale is $\log_{10} X$, $X \in S$. The vertical scale, $\mu(X)$, is
a measure of ‘significant decimal digits’ computed by evaluating the formula

(15)  $\mu(X) = -\log_{10} \dfrac{X^+ - X}{X}$

in double precision. Here the equations

(16)  $\Psi(X^+) = \Psi(X) + 2^{-27}, \qquad \sigma(X^+) = \sigma(X) + 2^{-23}$

determine $X^+$ from $X$ for SLI and IEEE, respectively.
Figure 1. Comparison of SLI against IEEE for w = 32.

Figure 2. Comparison of SLI against IEEE for w = 32, 64, 128.

Figure 1 illustrates differences between 32-bit IEEE and SLI arithmetic. First, the
IEEE curve exhibits oscillatory behavior not present in SLI. This is due to the
phenomenon [4]^1 known as wobbling precision. The logarithmic curve, were it shown,
would lie within the oscillatory band and would be essentially constant. The SLI
curve is smooth and gradually decreasing. Second, the IEEE curve does not extend
beyond the overflow limit of $10^{38}$, approximately. Numbers beyond this limit have no
IEEE representation other than a generic infinity symbol, whereas SLI retains useful
significance far beyond the limit. Third, the two curves cross at approximately
$X = 2400$. Before this point SLI has more significance, while beyond it IEEE does
until it fails at the overflow limit.
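
The 32-bit crossover can be explored numerically by comparing the relative gap to the next representable number in each system. The sketch below is written under stated assumptions: significance is taken to be governed by that relative gap, the SLI index has 27 bits and the IEEE fraction 23 bits, and the function names are hypothetical. It is not the authors' plotting procedure.

```python
import math

def psi(t):
    """Generalized logarithm for t >= 0."""
    level = 0
    while t >= 1.0:
        t = math.log(t)
        level += 1
    return level + t

def phi(t):
    """Generalized exponential for t >= 0, inverse of psi."""
    level = math.floor(t)
    value = t - level
    for _ in range(level):
        value = math.exp(value)
    return value

def sli_relative_spacing(X, index_bits=27):
    """Relative gap above X >= 1 when the SLI image is increased by 2**-index_bits."""
    x = psi(math.log(X))                              # SLI image of X
    lower = math.exp(phi(x))
    upper = math.exp(phi(x + 2.0 ** -index_bits))
    return (upper - lower) / lower

def ieee_relative_spacing(X, fraction_bits=23):
    """Relative gap above X for a binary format with the given number of fraction bits."""
    _, exponent = math.frexp(X)                       # X = mantissa * 2**exponent
    ulp = 2.0 ** (exponent - 1 - fraction_bits)
    return ulp / X

for X in (100.0, 2400.0, 1.0e6):
    print(X, sli_relative_spacing(X), ieee_relative_spacing(X))
```

Under these assumptions the SLI gap is the smaller of the two at 100, the larger at one million, and close to $2^{-23}$ near 2400, in keeping with the crossover described above.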

Figure 2 compares 32-, 64- and 128-bit IEEE^2 and SLI over the range $S = [1, 10^{\ldots}]$.
In  the  IEEE  formats  the  overflow  limit  is  increased  as  w increases  by  extending  the 
width  of  the  field  that  holds  c(X);  cf.  (8).  The  field  widths  are  8,  11  and  15  bits, 
respectively.  Accordingly,  as  w increases  the  SLI  index  field  gains  more  bits  than 
the IEEE significand field. This results in the crossover point increasing from 2400
in the 32-bit format to approximately $10^{13}$ and $10^{97}$ in the 64- and 128-bit formats,
respectively.  In  computing  applications  that  involve  numbers  that  lie  mostly  to  the 
left  of  the  crossover  point,  the  relative  precision  of  SLI  should  exceed  IEEE. 

In the authors’ experience instances are known where double precision is used in
practical computations not because single precision is too inaccurate but to avoid
overflow. In SLI overflow (and underflow) are impossible, so the precision can be
chosen solely on accuracy requirements. Another advantage is that the field for the
level is always 3 bits. Without a clear mathematical criterion for subdividing the
floating-point word into its two constituent subfields, a variety of inconsistent formats
has emerged. This is still true even after widespread adoption of the IEEE standard.

^1 The phenomenon occurs for radix 2 as well as for higher floating-point radices, contrary to what
is claimed in [4].

^2 Strictly, the IEEE standard does not specify 128-bit formats. The format used here is a plausible
extension that has been used in commercial computer products.

4 Concluding  Remarks 

The  active  developers  of  LI  and  SLI  are  small  in  number  but  they  have  produced  a body 
of  literature  on  algorithms,  applications  and  error  analyses  some  of  which  is  contained 
in  the  1989  survey  [2].  This  reference  summarizes  the  recursive  algorithms  that  are 
used  to  perform  the  basic  LI  and  SLI  arithmetic  operations  in  fixed-point  arithmetic 
with  a small  number  of  guard  digits.  The  1995  paper  [7],  which  discusses  present  and 
planned  software  simulations,  lists  additional  references  in  its  bibliography. 

LI  and  SLI  possess  several  advantages  compared  to  floating-point.  Some  of  these  have 
been  introduced  in  this  paper.  Freedom  from  underflow  and  overflow  is  the  greatest 
advantage.  Error  analysis  in  terms  of  generalized  errors  may  appear  to  be  an  obstacle 
but  it  may  have  advantages,  for  example  in  appropriately  measuring  computational 
error  in  the  neighborhood  of  a zero.  This  possibility  will  be  taken  up  in  a future 
paper. Finally, in contrast to floating-point, an increase or decrease in wordlength to
accommodate  changing  needs  for  precision  is  achieved  naturally  in  LI  and  SLI. 